CyanoLyase: a database of phycobilin lyase sequences, motifs and functions
نویسندگان
چکیده
CyanoLyase (http://cyanolyase.genouest.org/) is a manually curated sequence and motif database of phycobilin lyases and related proteins. These enzymes catalyze the covalent ligation of chromophores (phycobilins) to specific binding sites of phycobiliproteins (PBPs). The latter constitute the building bricks of phycobilisomes, the major light-harvesting systems of cyanobacteria and red algae. Phycobilin lyases sequences are poorly annotated in public databases. Sequences included in CyanoLyase were retrieved from all available genomes of these organisms and a few others by similarity searches using biochemically characterized enzyme sequences and then classified into 3 clans and 32 families. Amino acid motifs were computed for each family using Protomata learner. CyanoLyase also includes BLAST and a novel pattern matching tool (Protomatch) that allow users to rapidly retrieve and annotate lyases from any new genome. In addition, it provides phylogenetic analyses of all phycobilin lyases families, describes their function, their presence/absence in all genomes of the database (phyletic profiles) and predicts the chromophorylation of PBPs in each strain. The site also includes a thorough bibliography about phycobilin lyases and genomes included in the database. This resource should be useful to scientists and companies interested in natural or artificial PBPs, which have a number of biotechnological applications, notably as fluorescent markers.
منابع مشابه
Functional motifs in Escherichia coli NC101
Escherichia coli (E. coli) bacteria can damage DNA of the gut lining cells and may encourage the development of colon cancer according to recent reports. Genetic switches are specific sequence motifs and many of them are drug targets. It is interesting to know motifs and their location in sequences. At the present study, Gibbs sampler algorithm was used in order to predict and find functional m...
متن کاملMining New Motifs from Cdna Sequence Data
General biological databases that store basic information on genome, transcriptome, and proteome are indispensable sequence discovery resources. However, they are not necessarily useful for inferring functions of proteins. To see this, we observe that SWISS-PROT —a protein knowledgebase containing curated protein sequences and functional information on domains and diseases—has grown a mere 26-f...
متن کاملThe roles of EPIYA sequence to perturb the cellular signaling pathways and cancer risk
Abstract It was shown that several pathogenic bacterial effector proteins contain the Glu-Pro-Ile-Tyr-Ala (EPIYA) or a similar sequence. These bacterial EPIYA effectors are delivered into host cell via type III or IV secretion system, where they undergo tyrosine phosphorylation at the EPIYA sequences, which triggers interaction with multiple host cell SH2 domain-containing proteins and thereby...
متن کاملAutomated Discovery of Protein Motifs With Genetic Programming
Automated methods of machine learning may prove to be useful in discovering biologically meaningful information hidden in the rapidly growing databases of DNA sequences and protein sequences. Genetic programming is an extension of the genetic algorithm in which a population of computer programs is bred, over a series of generations, in order to solve a problem. Genetic programming is capable of...
متن کاملComputing motif correlations in proteins
Protein motifs, which are specific regions and conserved regions, are found by comparing multiple protein sequences. These conserved regions in general play an important role in protein functions and protein folds, for example, for their binding properties or enzymatic activities. The aim here is to find the existence correlations of protein motifs. The knowledge of protein motif/domain sharing...
متن کامل